A. SPLICE AND DATA DICTIONARY


The SPLICE (Stock Point Logistics Integrated Communication
Environment) concept comes as a result of the always growing
demands of the U.S. Navy for automated data processing [Ref. 1]
and inventory control at various points. A design and
implementation strategy is necessary, based on a distributed
architecture for a local area network (LAN).

SPLICE is designed to increase the ADP facilities of the existing
Navy stock point and inventory control point. Because the current
Uniform Automated Data Processing System-Stock Points cannot
support the growing requirements for automated data processing
(ADP) without a total redesign, an effort has been undertaken to
improve the system in the short and long term [Ref. 1]. Two major
objectives are behind the SPLICE development:

1. To increase CRT display terminals so users can interactively
access the system's data base.

2. To standardize the various current interfaces across the 62
supply sites.

The design approach starts from the design of the logical or
virtual Local Area Network (LAN), by specifying all the functional
modules, their characteristics, and the communication protocols
without focusing on the hardware characteristics. A later phase of
the SPLICE project will anticipate the mapping of the virtual LAN
requirements onto a physical local network.


The following functional modules are involved in development of
the system:

- Local communications (LC)
- National communications (NC)
- Front-end processing (FEP)
- Terminal management (TM)
- Data base management (DBM)
- Session services (SS)
- Peripheral management (PM)
- Resource allocation (RA)

This LAN design provides for distributed control but does not
provide for the distribution of data bases within a LAN. The data
bases of the SPLICE system are geographically distributed over a
wide area and, for the purpose of maintaining the integrity of the
system, the data base functions are centralized within each LAN. A
DBMS module for the system must at least provide dictionary,
integrity, recovery, query language, and security features as well
as compatibility with existing COBOL programs.

The functions of the DBM module would be:

- Catalog, to maintain a catalog of file names and status (name,
open or closed, size, physical address of file, physical address of
index, application used in, date entered into system, expiration
date if any, location of backup copy, format, access restriction).

- Operations, under a menu selection scheme to perform various
functions (retrieve and display a record, update specified fields
of a record, delete a record, insert a record, print a file, print
a record or specified fields of a record, answer specified queries
and display and print the results).

- Dictionary, for defining and characterizing the data elements.
The dictionary must be integrated with the DBMS. This will
contribute to data integrity and consistency throughout the system
and should also be of great assistance in designing report formats.
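
The catalog function listed above can be pictured as one record per
file plus a small lookup service. The following is only a minimal
sketch under stated assumptions: the class names, field names, and
the `expired` helper are hypothetical and are not taken from the
SPLICE specification; only the list of catalogued properties comes
from the text.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class CatalogEntry:
    """One file's catalog record (field names are illustrative)."""
    name: str
    is_open: bool
    size_bytes: int
    file_address: int               # physical address of the file
    index_address: Optional[int]    # physical address of the index, if any
    application: str                # application the file is used in
    date_entered: date
    expiration_date: Optional[date]
    backup_location: str
    record_format: str
    access_restriction: str


class Catalog:
    """In-memory stand-in for the catalog function of the DBM module."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        # Refuse duplicates so that each file name stays unique.
        if entry.name in self._entries:
            raise ValueError(f"file {entry.name!r} is already catalogued")
        self._entries[entry.name] = entry

    def lookup(self, name: str) -> CatalogEntry:
        return self._entries[name]

    def expired(self, today: date) -> List[str]:
        """Names of files whose expiration date has passed, sorted."""
        return sorted(
            e.name for e in self._entries.values()
            if e.expiration_date is not None and e.expiration_date < today
        )
```

A real DBM module would keep this catalog in the data base itself;
the dictionary in memory here only illustrates the shape of the data.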

With this improved design it is believed that the SPLICE system
will provide economical and responsive support capabilities among
the 62 different geographical locations, each having a different
mix of application and terminal requirements.

The SPLICE functional design approach suggests developing several
functional modules, distributed in minicomputers throughout the LAN
with the necessary communications to support them [Ref. 2]. This
design provides for higher system availability than the centralized
approach, since functional modules can be moved from one physical
node to another without changing their logical addresses [Ref. 3].
At this time there exist no exact methods for designing distributed
systems, and so an objective of the NPS research program for SPLICE
is to advance knowledge about distributed systems and to increase
understanding of how distributed systems must be designed in order
to operate effectively.

Distributed systems have problems associated with their design that
need solutions in particular areas [Ref. 4: p. 22]. The distributed
system must provide the ability for the user to communicate and
access information across the 62 local networks interconnected by
the Defense Data Network (DDN). It must be possible for the user at
Naval Supply Center (NSC) Oakland to access the Inventory Control
Point (ICP) database at Mechanicsburg in the same way as the local
database at Oakland [Ref. 4].




The data dictionary must provide support for uniquely naming and
identifying objects in the SPLICE system. In the case of a message
which is sent to another local network, the dictionary can obtain
the physical destination address with the Session Services module
(Figure 1.1). For object naming and addressing and software
maintenance, the data dictionary can help by storing all the
name-to-address mapping and routing information. The data
dictionary can also be used to specify task requirements for the
user terminal processes. The data dictionary in a distributed
environment will cooperate closely with the session services
module, which provides assistance to the user terminal processes in
carrying out their tasks. Thus a distributed operating system must
provide, in addition to other functions, the ability to access
effectively the dictionary/directory system (Figure 1.2 from
Ref. 4).

Figure 1.1 Network Services Directory and Dictionary.
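
The name-to-address mapping described above can be illustrated with
a small lookup table. This is a sketch only, under stated
assumptions: the `Directory` and `Address` names, the dotted object
names, and the LAN and node labels are hypothetical and do not come
from the SPLICE design documents.

```python
from typing import Dict, NamedTuple


class Address(NamedTuple):
    lan: str     # local area network that hosts the object
    node: str    # physical node within that LAN


class Directory:
    """Maps logical object names to physical addresses."""

    def __init__(self, local_lan: str) -> None:
        self.local_lan = local_lan
        self._table: Dict[str, Address] = {}

    def bind(self, name: str, address: Address) -> None:
        # Rebinding a name models moving a module to another node
        # without changing its logical name.
        self._table[name] = address

    def resolve(self, name: str) -> Address:
        try:
            return self._table[name]
        except KeyError:
            raise LookupError(f"unknown object name: {name!r}") from None

    def is_remote(self, name: str) -> bool:
        """True when a message to `name` must leave the local LAN."""
        return self.resolve(name).lan != self.local_lan
```

Because callers only ever use the logical name, an object can be
moved by rebinding it, which is the property the functional design
relies on for availability.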


Major systems of the SPLICE application environment are the
Integrated Disbursement and Accounting (IDA), Automated Procurement
and Data Entry (APADE), Uniform Automated Data Processing
System-Stock Points (UADPS-SP), and the Trident Logistics Data
System (LDS). Each of the above systems has its own elements,
files, programs, transactions, users and reports [Ref. 4].

It is vital for the system to manage all the resources efficiently,
and the distributed environment makes this job more difficult. A
data dictionary/directory system (DDS) seems to be one approach to
solving the data design and management problem. For the centralized
database environment three aspects are emphasized [Ref. 5]:

- The software interfaces between the D/D system and other
software packages

- The convert functions of the D/D system

- The environmental dependency between the D/D system and a
database management system (DBMS).

For the distributed database environment, as in the case of SPLICE,
there must be extensions to the centralized D/D, additional
software interfaces, and the use of the D/D as a distributed
database.

B. OBJECTIVES OF THESIS


The SPLICE project at the Naval Postgraduate School (NPS) takes the
approach of designing the logical or virtual Local Area Network
(LAN) first, specifying all the functional modules, their
characteristics and the communication protocols, rather than
focusing on the hardware characteristics of the LAN first [Ref. 1],
developing alternatives for SPLICE Local Area Networks. After
providing a functional specification for a distributed operating
system, user interface specifications are provided, where the
dictionary/directory system (DDS) constitutes a major component
[Ref. 4] and its function is to provide support for naming and
identifying objects in SPLICE.

The objectives of this thesis are to investigate the area of data
dictionary/directory systems (DDS), to outline the advantages and
disadvantages of these systems, and to present the underlying
ideas. Special attention is paid to the distributed environment and
to the benefits for the SPLICE system from using a
dictionary/directory system. Finally, an attempt will be made to
introduce the interface requirements between a data
dictionary/directory system for SPLICE and the neighboring modules.


A. GENERAL REVIEW


A data dictionary is a description of data resources. It contains
both machine-readable and human-readable descriptions of the
database tables, their attributes, interrelationships, and
semantics. It is usually not very large, but it has a very rich
structure. Most systems have a data dictionary facility which
stores metadata about the database aside from the database itself.
The data dictionary is often built on top of the DBMS as a special
application in a special data definition language.

Thus a DDS is a set of one or more databases containing data about
an organization's information resources. Those resources can be
retrieved and analyzed using standard database management system
(DBMS) capabilities. The concept of a data dictionary system has
existed in the data processing industry for a number of years. Use
of such a system consists, basically, of an attempt to capture and
store in a central location definitions of data and other entities
of interest [Ref. 6]. The principles of such a system are:

- Provide for better data control

- Provide for better documentation

- Improve the quality of the systems that are built in terms of
user functionality and satisfaction and system maintainability.

The data dictionary helps to capture and document data elements,
their definitions and some of their descriptive attributes. It also
provides for logical grouping of data elements during the process
of gathering requirements to build a new system. The data element
dictionary provides the vocabulary that can be used between the
systems analyst and the end-user [Ref. 6].

Next in the spectrum of usage, the help of the DDS is twofold. The
first extension is to include information on how and by whom the
data elements can be used. Thus a dictionary can be used to store
the definitions of data elements and the definitions of other data
constructs (records, files), the definitions of processes (programs
or manual processes), and definitions of data users (individuals,
organizations). The second trend that contributed to this extended
usage of a dictionary system was the gradual migration away from
the use of traditional files toward the concept of a central,
integrated database, distributed across the DDN but centralized
within each LAN, under the control of a database management system.

The problem of duplication of data (data redundancy) can be solved
inside each LAN, but another mechanism must be provided in order to
solve that problem across the DDN. This problem must be examined
carefully, and that mechanism must provide for economy, because
sometimes data redundancy may be more cost-efficient than the
frequent use of the DDN.

The above is vital for system design because in the SPLICE
environment data are to be shared not only by different systems,
but also by a wide range of users. The basic concept of a DBMS is
to provide a centrally located set of definitions of data within
each LAN that is to be shared, in order to assure that different
users will access common data with a set of consistent definitions.

The DDS acts as a repository of all definitive information about
the database such as characteristics, relationships, and access
authorizations. These databases, as implied by the term
'logically', can be physically in diverse locations within each LAN
but are logically linked via communications and the DDS.

The data dictionary system located in a node within each LAN can be
used to provide the above definitions and thus the required data
consistency.

Separating the data dictionary from the database raises two
problems [Ref. 7]:

- The dictionary and data base may disagree with one another unless
one interface has control of both functions

- Having a separate data dictionary implies having a separate
language for the definition and manipulation of the dictionary
database.

Users who define tables and other objects (as in the case of
System R) are encouraged to include English text to describe the
meanings of the objects. Later, other users can retrieve tables
with certain attributes or can browse among the descriptions of
defined tables, if they are so authorized. A user can later modify
these entries to change the attributes of an object.


B. MANAGEMENT OF INFORMATION RESOURCES


Information resource management (IRM) is a methodology that
attempts to solve a set of problems related to the system life
cycle in an integrated and coordinated manner. The data dictionary
system will play an important role in this area.

In the case of SPLICE the DDS can play an important role in
providing a documented inventory of information resources, a
control mechanism for the analysis and design of new information
resources, and the necessary resource independence.




A data dictionary can be used as a powerful tool (not as a
solution) that can aid in the solution of various problems such as
inventory control, report production, proper routing of data,
proper routing of requests, data consistency, security, etc.

Finally, the dictionary system project is in fact an Information
Resource Management (IRM)¹ project. The SPLICE system possesses
much valuable data that has been generated, collected, and stored
in an automatic and 'formatted' state. Utilization of any class of
data involves one or more processes. These are [Ref. 6]:


- Collection: It is a process that tends to be expensive, as the
cost of identification and recording (including input to an
automated system, as necessary) can be high.

- Processing: The data collected is generally 'managed' in some
fashion before and/or after being stored. In the case of automated
data, this occurs through the use of computer programs.

- Storage: The repository of data and information, termed a "data
base".

- Retrieval: Using the knowledge about the storage technique being
used, data are retrieved to answer questions or to be modified.

- Communications: A communication line is needed to connect the
user terminal with the place where the dictionary resides.


¹Information Resource Management is whatever policy, action, or
procedure concerning information (both automated and non-automated)
which management establishes that serves the overall current and
future needs of the system. Such policies, etc. would include
considerations of availability, timeliness, accuracy, integrity,
privacy, security, auditability, ownership, use, and cost
effectiveness [Ref. 6].

The environment in which the above processes take place is composed
of:

- Data and information. Represents the core of the entire
information processing spectrum.

- The users in the system. It is the personnel involved with the
system. These are users of data and other information components.

- Physical facilities. Computer hardware and other physical devices
used in data processing.

- Activities. These are all the activities which take place in the
use of physical facilities.

- Support facilities. All the services which are required by users
of data as well as personnel whose responsibilities are primarily
in the information systems area.

Each of the above components is referred to as an Information
Resource. The computer system must provide an integrated and
coordinated way to manage the entire information resource of the
SPLICE system, and the data dictionary has to play a major role in
conjunction with the database management module.


C. SUPPORT OF SYSTEM LIFE CYCLE


In this section, we present some highlights of how the data
dictionary supports the main steps of system development.

The waterfall model of the software life cycle [Ref. 14] consists
of the following stages: system feasibility, requirements
specification, product design, detail design, coding, integration,
implementation, operations and maintenance. Of course there are
also other models of a software life cycle, but basically the
functions of a DDS are the same in whatever model we consider.

During the system's feasibility stage the DDS can be used for new
data element collection and to avoid redundancies and
inconsistencies. The DDS can also contain a description of
processes that are already available and so help in assessing the
true magnitude of the proposed task. During the requirements
specification stage, the data dictionary can provide the means to
detect existing inaccuracies in definitions and to correct them
before the system goes into operation. This is because the DDS
contains the overall scope of the requirements to be specified.

During the product design and detail design stages, the DDS can
help because it contains the design details of both data and
processes, which can be shared by all members of the design team.
Particularly in database design the DDS can record multiple user
views, pass output from the logical design phase to the physical
design phase, generate multiple designs for benchmark testing, and
verify the existing conversions of data in the system. For the rest
of the stages the DDS can help in data collection, coding, and
testing, by providing any desired degree of coordination and
control over tasks, generating data structures, storing
instructions for the staff, describing the various jobs and
activities, and finally, providing a means for effective and
consistent modification of the system.

Additional benefits that can be derived from the DDS [Ref. 6] are
naming standards, aid to auditing, interfaces to application
program development tools, and software configuration management. A
DDS allows a system to be extended through the addition of new
entity types, relationship types, and attribute types, and can also
be used to add configuration entity types such as requirements
specifications, change notices, etc. The major advantage from the
use of the DDS is in the case of an active system, where the system
not only records the entities, but also controls how they are
revised.


D. DATA DICTIONARY-SYSTEM ORGANIZATION


The organizational structure for a DDS that is adopted must be
commensurate with the size of the activity at any one time. Such a
structure is displayed in Figure 2.1.

Figure 2.1 SPLICE Data Admin. Function Organization.

The Data Administrator is the person responsible for articulating
the data policy after the major guidelines have been laid down by
the designing team. That policy includes planning for data
collection, its structuring, its storage and its quality control.
For the SPLICE system the Data Administrator can be a person or a
team located in any place, whose main function will be the setting
of the above policy.




The Dictionary Administrator is the person or team responsible for
the dictionary system within the Data Administration function
(e.g. recording of all meta-information and meta-data and its
maintenance through the use of the dictionary system, along with
making its facilities available to the users of this system).
Because in the SPLICE system the data dictionary is unique
throughout the system and no different views of the data dictionary
are permitted in the various locations, that team or person must be
unique throughout the system. Only that team (or person) must have
the privilege to maintain the DD.

The Database Administrator is the person (or team) responsible for
the technical aspects of obtaining, running and maintaining the
DBMS. Since SPLICE is a distributed system with databases
distributed across 62 different locations, the Database
Administrator does not need to be unique. The required policy and
definitions are set up by the data dictionary administrator, and
this is enough to maintain consistency through the whole system.

The Data Quality Inspection team also has a role in the hierarchy,
and its function is the quality inspection of the information or
data, and the quality audit trail of the whole system. This can be
one or more teams. In the case of several teams the entire audit
effort can be divided among them.


E. CONCEPTS ON DDS SELECTION AND EVALUATION


It is very difficult to find a commercially available DDS to meet
exactly the requirements of a system under development. A selection
and evaluation process composed of various steps must be developed
in order to select the best system.

Four steps are proposed by [Ref. 6] for the process of selection
and evaluation of a DDS:

- Determine the requirements for the dictionary system. These
should be classified as either being mandatory or not. If not
mandatory, establish a scale and assign numbers indicating the
importance.

- Develop a list of features of dictionary systems that will be
used in the evaluation of systems.

- Determine a mapping from the needs onto these features.

- For each mapping, using descriptions of available systems, a
system can be found either to qualify or not. This process leads to
the elimination of systems that do not qualify.
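
The four steps above amount to a filter on mandatory requirements
followed by a weighted score on the optional ones. The sketch below
illustrates that reading only; the function name, the feature names,
and the weights are hypothetical, and [Ref. 6] does not prescribe
this particular scoring rule.

```python
from typing import Dict, List, Set, Tuple


def evaluate(
    mandatory: Set[str],
    weights: Dict[str, int],
    candidates: Dict[str, Set[str]],
) -> List[Tuple[int, str]]:
    """Rank candidate dictionary systems.

    mandatory  -- features every qualifying system must offer
    weights    -- importance assigned to each non-mandatory feature
    candidates -- features offered by each candidate system
    Returns (score, name) pairs for qualifying systems, best first.
    """
    ranked = []
    for name, features in candidates.items():
        if not mandatory <= features:
            continue  # eliminated: a mandatory requirement is missing
        score = sum(w for f, w in weights.items() if f in features)
        ranked.append((score, name))
    # Highest score first; ties are broken alphabetically by name.
    return sorted(ranked, key=lambda pair: (-pair[0], pair[1]))
```

Fixing the weights before any candidate is examined is one way to
reduce the inconsistency that arises when different evaluators judge
different systems.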

We cannot say that the above procedure is perfect and free of the
risk of mistakes, because it is subjective and depends on the
experience and skill of the selection/evaluation team. Some of the
more common reasons leading to mistakes are: the needs were never
properly assessed, and potential users were not asked the right
questions; unnecessary but apparently "nice" features were given
high values; the evaluation of the system was inconsistent because
different people evaluated different systems without a well-defined
measurement method; undue emphasis was placed on features that will
be needed in the future but are unimportant now, etc.

For the SPLICE system we cannot follow the above procedure. SPLICE
has decided to use Tandem as its "front end" minicomputer. As a
result, selecting a DDS is largely a foregone conclusion in this
situation. So we have to use the Tandem DBMS and the associated
dictionary capabilities.


F. ADDITIONAL ASPECTS OF DDS


In the next few years, several extensions to dictionary systems,
not available today, will most likely be commercially available.
These additions will allow dictionaries to be more effective in
interfacing with the information resources. The use of
extensibility facilities allows an installation to customize the
dictionary system in order to make it effective in such
applications. Such examples are the uses of the DDS to control the
total information resource, to aid in the analysis, design and
development of information systems, and to aid in efficient
database design. The last application example is the use of the DDS
as a repository of information for an entire system. This is
exactly the major role the DDS has to play in the SPLICE system.

Referring to the SPLICE application environment, the DDS would
require users and analysts to define the system data elements,
files, etc., which would entail updating old definitions,
discarding outdated ones, and introducing new ones. In this way
standards of data definition and description for application
programs can be established over the entire SPLICE system [Ref. 4].
On the other hand, it is a Herculean task to retrofit a dictionary
to existing application systems. Because of the many difficulties
in implementing the dictionary for old application systems, we
recommend as much more preferable to implement a dictionary for new
applications only. That means that the dictionary will be developed
gradually and a long period will be needed before it is fully
implemented for the whole SPLICE system.

Although DDSs have many advantages, their disadvantages should be
mentioned as well. Dictionary systems are complex software systems
and the execution of many dictionary functions may consume a
significant part of the system resources. As the scope of the
dictionary is enlarged to include an ever larger number of
information resources, the DDS will gradually begin to look like
the major resource consumer, and thus the main user of the host
computer system
[Ref. 6]. When we consider active interfaces of the DDS, the
previous problem becomes more serious. When the dictionary controls
a process, it follows that this process cannot continue until the
dictionary system has finished its job. This delay time is added to
the whole service time and, because there can be many processes,
the accumulated service time can create a bottleneck.

The proposed solution for the SPLICE system [Ref. 4] can avoid (or
at least reduce) this overhead by locating one copy of the DDS in
each of the 62 stock points. With this simple and efficient
solution, a process in any LAN that needs the DDS services remains
within its own LAN and needs to consult only the local copy, so the
long queuing time across the network is reduced by a factor close
to 62. [...] A structure is proposed in Figure 2.2, and we believe
that it is less expensive in consuming the system resources than
the structure of having different views of the master dictionary at
each LAN.

Figure 2.2 A First DDS Hierarchical Structure for SPLICE.

In particular, suppose the copies of the local
dictionaries are not exact images of the master dictionary, but are
different views of the master, especially views containing
information only for the local database. In such a case it is not
useful to separate the definitions from the actual database, since
the different views of the whole database are centralized within
each LAN. If a spare part, for example, cannot be found in a local
database, then the user has to consult the master dictionary to
find the location of the requested spare part, because the local
copy of the data dictionary does not contain information about
other data bases of the system. In this case the user has to access
the DDN twice, first to consult the master dictionary and then to
consult the database in which the spare part is located. This
procedure can easily lead to long waiting times and finally to a
"bottleneck", because the master dictionary will have to answer
questions coming from 62 different LANs.

A second hierarchical structure is shown in Figure 2.3. This
structure involves the location of a copy of the DDS in selected
nodes instead of each node. In this way we reduce the amount of
secondary memory needed to store the D/D but we increase the use of
the DDN. This increase in use of the DDN is inversely proportional
to the number of replicated D/D copies. The solution of locating
exact copies of the master dictionary in each or selected LANs has
the disadvantage of consuming more secondary storage, but our
estimation is that this is preferable and less expensive than the
frequent use of the DDN in order to consult the master copy.
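
The trade-off above can be made concrete by counting DDN accesses
per request. The sketch below is a simplified model under stated
assumptions: the function names are hypothetical, every LAN is
assumed to issue the same number of remote requests, and a LAN
without a full dictionary copy is assumed to need exactly one DDN
access to consult one elsewhere.

```python
def ddn_accesses(item_is_local: bool, dictionary_is_local: bool) -> int:
    """DDN accesses needed to reach one item.

    item_is_local       -- the item is held in the requester's own LAN
    dictionary_is_local -- the requester's LAN holds a dictionary copy
                           that can locate items in any LAN
    """
    if item_is_local:
        return 0  # found in the local database, the DDN is not used
    lookup = 0 if dictionary_is_local else 1  # consult a remote dictionary
    fetch = 1                                 # reach the remote database
    return lookup + fetch


def dictionary_traffic(lans: int, lans_with_copy: int,
                       remote_requests_per_lan: int) -> int:
    """Dictionary consultations that must cross the DDN."""
    if not 1 <= lans_with_copy <= lans:
        raise ValueError("at least one copy (the master) must exist")
    return (lans - lans_with_copy) * remote_requests_per_lan
```

In this model a request for a remote spare part costs two DDN
accesses when only local views are kept and one when an exact copy
of the master is held locally, and dictionary traffic on the DDN
falls to zero once every LAN holds a copy.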

We cannot say that distribution instead of replication of the DDS
is an inefficient method not acceptable for SPLICE. Since there is
not enough experience with distributed systems, and especially with
data dictionaries, we have to examine carefully every possible
architecture, and the pros and cons of each one, in order to make
the best decision. Still, we believe that the decision will be
based more on estimations coming from intuition and less on
experience and statistical information. Such an architecture is
based on distribution instead of replication of the D/D. This is
shown in Figure 2.4, and will be examined later.

Figure 2.3 A Second DDS Hierarchical Structure for SPLICE.

Figure 2.4 A Third DDS Hierarchical Structure for SPLICE.


G. FEATURE ANALYSIS OF DDS


In this section the features of a DDS and a more detailed analysis
of them will be presented. This presentation is a theoretical
approach and does not concern any particular system. A cost/benefit
analysis can tell us which features need to be included in a DDS
under development. This is a more preferable approach than to
develop a DDS as described below using the Tandem DBMS capability.


1. Architecture and Implementation


The relationship between DDS and DBMS will be addressed here. The
purpose of a DBMS is to manage data and the purpose of a DDS is to
manage meta-data.² The question is whether the DDS must be a
free-standing³ or DBMS-dependent⁴ system [Ref. 6].

The free-standing approach is good for commercial systems because
each enterprise can evaluate the pros and cons and reach the
optimal decision whether to buy or not. This approach raises
compatibility problems between the DDS and the DBMS, especially
when the vendors are different companies. There are many factors we
have to take into account when deciding whether a DDS must be
free-standing or DBMS-dependent. These factors include the method
of implementation, the scope of usage, whether the DDS and DBMS are
going to be developed together or not, and whether they are going
to be supplied by the same vendor or not.

One other feature of DDS architectural structure is whether the DDS
should be passive or active. Suppose there is a compiler,
application program, or other process that requires meta-data for
its execution. There should be a DDS available which automatically
produces the required meta-data. This functionality is referred to
as a dictionary interface and can operate in two modes:

Passive, where there exists an option of whether the process will
retrieve the required meta-data (through the dictionary interface
or from elsewhere) or, in the case where the process already
contains the meta-data, there exists an option for the system to
check whether this meta-data is the most current version in the
dictionary. Here the dictionary is not in the critical path of a
process.

Active, where the above options do not exist and the process always
uses the most current meta-data in the dictionary. The dictionary
here is in the critical path of the process, and the process must
go through the dictionary for the meta-data in order to execute
properly.

²Meta-data is the data that describes data.

³A dictionary system which does not use a DBMS in its
implementation.

⁴A dictionary system which does use a DBMS in its implementation.
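
The difference between the two modes can be sketched as two ways of
obtaining meta-data. This is a minimal illustration under stated
assumptions: the `Dictionary` class, the version counter, and both
function names are hypothetical and do not describe the interface of
any particular DDS product.

```python
from typing import Dict, Optional, Tuple

Meta = Tuple[int, str]   # (version number, definition text)


class Dictionary:
    """Holds the current definition of each named data element."""

    def __init__(self) -> None:
        self._meta: Dict[str, Meta] = {}

    def define(self, name: str, definition: str) -> None:
        # Each redefinition increments the version number.
        version = self._meta.get(name, (0, ""))[0] + 1
        self._meta[name] = (version, definition)

    def current(self, name: str) -> Meta:
        return self._meta[name]


def passive_fetch(d: Dictionary, name: str,
                  cached: Optional[Meta], check: bool) -> Meta:
    """Passive mode: the process may keep its own copy of the meta-data,
    and checking it against the dictionary is optional."""
    if cached is None:
        return d.current(name)
    if check and cached[0] != d.current(name)[0]:
        return d.current(name)   # the cached copy was out of date
    return cached


def active_fetch(d: Dictionary, name: str) -> Meta:
    """Active mode: every execution goes through the dictionary."""
    return d.current(name)
```

The passive path can run on stale meta-data when the check is
skipped, while the active path always sees the current definition
at the cost of putting the dictionary in the critical path.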

A DDS can contain both kinds of interfaces. We have to keep in mind
that the interfaces of the DDS do not only concern the DDS itself,
but also other modules with which the dictionary has to cooperate
in order to maintain the whole system.


2. Logical Schema, Entity Types, Relationship Types

Dictionary schema is the term denoting the logical structure of a
dictionary. Structural characteristics and contents of the
dictionary schema determine the kinds of meta-data and the
relationships to be established among them. Using the
entity-relationship-attribute model [Ref. 6] for the dictionary, we
define entities as real world objects or things about which
information exists in the dictionary, attributes as properties
(quantities or qualities) of the entities, and relationships as
connections between entities.

In the DDS, resources such as data, hardware, software,
transactions, personnel and documents may be represented, and
entities, attributes, and relationships associated with these
resources must also be represented.


Tables I through V at the end of this chapter, from [Ref. 4],
indicate possible data element attributes, file entity attributes,
hardware entities and attributes, software entities and attributes,
and document/report attributes for the SPLICE system.

Similar entities in a DDS establish entity types. Attributes can
also have a degree of similarity, and in this case we speak about
attribute types. Similar considerations apply to relationships, and
so we have relationship types, which are relationships between
entity types.

Schema descriptor: In a dictionary schema containing all existing
entity-types, relationship-types, and attribute-types, any one of
them can be referred to as a schema descriptor. Information existing
in the schema can indicate which entity-types are members of a given
relationship-type, and which attribute-types are associated with an
entity-type or relationship-type.

Entity-types of a DDS can be classified as data entity-types,
process entity-types and usage entity-types. Attribute types, on the
other hand, can be descriptions, classifications, and audit
attributes created by the dictionary to indicate identification of
the person who created the entity, date of entity creation,
identification of the person who last modified the entity, date of
latest modification, and total number of modifications of the entity
[Ref. 6]. These capabilities are very useful for a system,
especially one as complex as SPLICE. Using the above capabilities,
reports and summaries can be presented on request, and we can also
keep a trace of the various interactions on the system by using
application programs written for this purpose.
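
As an illustration of how audit attributes support such reports, the following Python sketch keeps the five audit attributes listed above with each entity and summarizes them on request. The class and function names (Audit, Entity, audit_report) and the sample values are our own assumptions, not part of [Ref. 6] or of the SPLICE specification.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Audit:
    """Audit attributes maintained by the dictionary itself."""
    created_by: str
    created_on: date
    modified_by: str = ""
    modified_on: Optional[date] = None
    modifications: int = 0


@dataclass
class Entity:
    name: str
    entity_type: str      # "data", "process" or "usage"
    attributes: dict      # descriptive attributes, e.g. {"length": 13}
    audit: Audit

    def modify(self, user, today, **changes):
        # Every change updates the audit attributes as a side effect.
        self.attributes.update(changes)
        self.audit.modified_by = user
        self.audit.modified_on = today
        self.audit.modifications += 1


def audit_report(entities):
    """Return one summary line per entity, sorted by entity name."""
    return [
        f"{e.name}: created by {e.audit.created_by}, "
        f"{e.audit.modifications} modification(s)"
        for e in sorted(entities, key=lambda e: e.name)
    ]
```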


3. Interfaces and Commands





Interfaces must be included in a DDS in order to allow the user to
communicate with the DDS via a terminal. The terminal-DDS
communication in the SPLICE system is carried out through the
Session Services module. This is a separate topic which will be
examined separately. In general an interface can be one of the kinds
shown in Table VI.

On the other hand, commands can be classified, on the basis of their
functionality, into various categories as shown in Table VII.

A dictionary system can be regarded as a software product that helps
in storing information about data that already exists in databases.
Both DDS and DBMS deal with descriptions and characteristics of data
elements and with the logical structures obtained from these
elements and their relationships. A closely integrated dictionary
system and automated database design process have much to offer. The
interfaces between a dictionary and a database design process can be
divided into two broad categories:

-Initial data entry and editing
-Logical model structuring

Initial data entry and editing: For data entry, the data
requirements information needed by automated database design
procedures is almost a complete (proper) subset of the information
normally stored in current commercial dictionary systems. For the
SPLICE the files already exist but the dictionary does not.
Therefore the whole design of the DDS must provide for initial
detection and avoidance of duplicate entries. As soon as the design
takes care of that during the initial steps, the entry of
information about raw data elements has to be made only to the
dictionary system. Next, an interface must exist in order to allow
the design procedures to access information in named aggregations
(local views). For editing, the initial data entry is rarely clean,
in the sense that names, usage, and characteristics of the data
elements may not yet be standardized across local views. Synonyms,
homonyms and inconsistent characteristics of the same data usually
result when data requirements are gathered from different sources.
The editing phases of the automated design procedures, and the
reports produced therein, can serve as an input filtering function
for the dictionary. When the interactive editing phases are
completed, obsolete information (e.g. non-standard names) can be
removed from the dictionary, so that the information remaining
permanently is clean and consistent. Again, as we mentioned in a
previous section, this can be done only for new applications,
because the task of retrofitting a dictionary to existing
application systems is very difficult.
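
The input filtering described above can be illustrated with a small Python sketch that flags homonyms and synonym candidates across local views. The data layout (a mapping from view name to element name to a tuple of characteristics) and the function names are assumptions made only for this example, and the synonym test is deliberately crude: identical characteristics only suggest, and do not prove, that two names denote the same data.

```python
from collections import defaultdict


def find_homonyms(local_views):
    """Names whose characteristics differ between local views."""
    seen = defaultdict(set)
    for view in local_views.values():
        for name, traits in view.items():
            seen[name].add(traits)
    return sorted(name for name, traits in seen.items() if len(traits) > 1)


def find_synonym_candidates(local_views):
    """Groups of different names that share identical characteristics."""
    by_traits = defaultdict(set)
    for view in local_views.values():
        for name, traits in view.items():
            by_traits[traits].add(name)
    return sorted(
        sorted(names) for names in by_traits.values() if len(names) > 1
    )
```

A report built from these two functions would be reviewed during the interactive editing phase before any name is standardized or removed.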

Logical model structuring: The structuring procedure for initial
design should be able to extract filtered, unstructured data element
information in named aggregates (local views) from the dictionary,
so that the composite model and the derived logical designs can be
generated in the normal manner.

For adding new requirements to existing designs, and when processing
new functions or adding new data to an existing database, the design
process should be able to extract from the dictionary a description
of the existing design along with the filtered unstructured data
element information for that which is new. Various levels of
constraints on the freedom of structuring processes can be set here
in order to facilitate the whole design effort.

Once the automated design process is completed and a suitable
logical design has been obtained, the results must be stored in the
dictionary. Assuming the unstructured data elements are already
described in the dictionary, the relationships defining segments,
databases, logical relations and secondary indexes would now be
stored.











TABLE I
Data Element Attributes

Type
Range
Length
Unit of measure
Usage
Language names
Repetitions
88 Levels
Key
Default value
Display format


TABLE II
File Entity Attributes

File name
Locations
Size (in bytes)
Format (seq, random, bin)
Access control
Access security protection


TABLE III
Selected Hardware Entities and Attributes

Entities
    Processing system
    Secondary storage
    Communications system
    Concentrator
    Terminals
    LAN I/O peripherals

Attributes
    Model
    Model number
    Serial number
    Manufacturer's number
    Source
    Features
    Description
    Docu. references
    Usage by site
    Cost
    Maintenance activity


TABLE IV
Selected Software Entities and Attributes

Entities
    Operating system
    Operational support system
    Environmental system
    Application software

Attributes
    Program-id
    Revision number
    Revision date
    Date compiled
    Patch level
    Change level
    License
    Date released
    Product number
    Source
    Features
    Documentation
    Usage
    Cost
    Maintenance activity


TABLE V
Document/Report Attributes

Name
Number
Product number
Release date
Revision number
Source
Feature
Description
Quantity
Cost


TABLE VI
Kinds of DDS Interfaces

Command language
Screen oriented interface
Fixed format batch data entry facility
Programmatic interface that allows user written applications
programs to access the dictionary


TABLE VII
Command Categories for DDS

Dictionary maintenance
Report and query
Data structure interface
Extensibility
Status related
Security
Dictionary processing control
Dictionary administrator
III. INTEGRATION

An active⁵ data dictionary is desirable for the SPLICE system. It is
also known [Ref. 8] that most dictionaries fail to meet this
objective. A prerequisite to an active dictionary is a high degree
of interaction between the dictionary and various other software
elements such as the DBMS itself, but also including query
languages, report generators, application development aids, and the
like. An architecture for a centered and highly integrated DDS,
taken from [Ref. 8], is shown in Figure 3.1.

The existing dictionaries today are noticeably unintegrated, and
hence less than active. Such a situation is shown in Figure 3.2
(taken from [Ref. 8]) concerning the IBM DB/DC data dictionary and
related software. Notice, in particular, that whereas some batch
feeding of data is provided to and/or from the dictionary, there are
no fewer than six places where database definition data is stored
(in addition to data definitions included in actual programs)
[Ref. 8]. These are:

-The DB/DC dictionary itself
-The DBD/PSB libraries
-The COBOL copy library
-The database design aid (DBDA)
-The GIS data definition tables
-The application development facility (ADF), segment rules in an
 IMS/DC environment, or in development management system (DMS)
 files in a CICS environment.

⁵ Active to some degree, because if it is too active we can lose
efficiency.
There is no guarantee that each of these descriptions will agree at
any point in time. Other data dictionaries may have a higher degree
of integration but none is close to the degree of integration
suggested in Figure 3.1. A high level of integration is very much
needed in order to support the advanced functions of an active
dictionary. To see this better, consider a user who wants to know
what data is in the database, or a DML routine which wants to
validate a field prior to updating the database, or the database
access system which needs to know if a user password is valid for
updating a certain record. All the above functions require direct
access to the data dictionary.

[Figure 3.1: DATA DICTIONARY and metadata base at the center;
surrounding labels: data base, data definition generator,
application program, inquiry, report generator, application
generator, data base design aid.]

Figure 3.1 Highly Integrated D/D Centered Architecture.

The extent to which a DDS qualifies as being "integrated" is a
relative notion determined by the scope of its metadata and the way
that it interfaces with other software. The most common use of the
term "integrated" is with reference to a D/D that is the sole source
of metadata in the system. The integrated D/D is accessed for all
references to metadata. Most of the commercially available DDS have
not reached a high degree of integration with their environments,
and this results in multiple sources of descriptors within the
systems. The DDS permits these systems to access the D/D indirectly
and converts the metadata of each system to the format required by
the D/D [Ref. 5]. So, for example, a DDS might communicate with a
compiler in either of two ways:

-By generating file and record definitions that the compiler
 accepts via copy statements.

-By reading source programs and creating transactions to load the
 DDS with descriptions of files, records, and elements.

One additional area which demands investigation for the development
of a successful DDS concerns integrating schemas which describe the
logical structures of all data types existing in a distributed
database (like the SPLICE). This feature permits the determination
of a data file's logical structure as well as its identity and
location, and could possibly be essential to the development of
query and data model translation schemes. The existence of a master
schema also permits the logical relation of data across file
boundaries; then all files in the network can be considered as areas
within a single large database [Ref. 9].
Figure 3.2 IBM Data Management Architecture.

B. INTEGRATION


Three aspects of an integrated DDS in the centralized and
distributed database environment for SPLICE⁶ are of great interest
and must be emphasized [Ref. 5]:

-The software interfaces

-The convert functions

-The environmental dependency between the DDS and the DBMS

A DDS is integrated with other software packages by facilities that:

-Allow direct and indirect access to the D/D

-Automatically capture the metadata used by other systems

In the next three subsections we will examine these three aspects of
an integrated DDS.


1. Software Interfaces 


A software interface permits another system to access the D/D either
statically or dynamically. First we consider the static interface,
which links the D/D with another system indirectly via the
extraction of a file of formatted metadata. For the static interface
of a DDS and a DBMS, for example, the data dictionary administrator,
following the specifications of the data administrator, enters into
the DDS all pertinent transactions to define the database, and the
database administrator, using the above definitions, describes the
database. After reviewing the accuracy of this database description,
a command is generated for the DDS that uses this description to
produce a file containing the DDL. The DBMS's DDL processor then
translates this generated DDL into a schema file that the run-time
unit of the DBMS can access. No run-time connection between the DDS
and the DBMS exists here; the DBMS processor is not executing during
the DDS's DDL-generation process.

⁶ Our approach for the SPLICE database and D/D distribution is
hybrid. SPLICE is a distributed system, but the databases are
centralized within each LAN. Also, the dictionary copies at each of
the selected LANs are exact copies of the master dictionary, and
different dictionary views are not defined. So the whole SPLICE
system can be viewed as a distributed system, but concerning each
particular LAN, the database and data dictionary can be said to
follow the centralized database environment concept. So both ideas
of centralized and distributed environments can be applied to the
SPLICE with slight modifications.

Static interfaces differ somewhat, depending upon whether they
interface the DDS with user-written programs or with vendor-supplied
software packages. Static interfaces for programs written in
languages such as COBOL and PL/I produce file, record, and database
descriptions for the user programs from the data dictionary
[Ref. 5]. These interfaces sometimes feature edit capabilities,
format options, and various other functions to make the interface
more flexible. Edit capabilities may include being able to add
prefixes and suffixes and even to replace entire names. Format
options may control indentation, level-number increments, sequence
numbers, and line identifiers. Inclusion of various clauses such as
comments, condition names, and initial values also may be allowed.
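
A static interface of this kind can be pictured as a text generator that runs against the dictionary and writes definitions into a library file. The Python sketch below is a hypothetical illustration only: the function generate_record and its options (prefix, first_level, level_step, indent) are names we introduce here, and the output is COBOL-style text rather than the product of any actual DDS.

```python
def generate_record(record_name, elements, prefix="", first_level=1,
                    level_step=4, indent=4):
    """Return COBOL-style record definition lines.

    elements is a list of (name, picture) pairs taken from the
    dictionary; prefix is an edit option applied to every name, while
    level_step and indent are format options.
    """
    lines = [f"{first_level:02d}  {prefix}{record_name}."]
    level = first_level + level_step
    pad = " " * indent
    for name, picture in elements:
        lines.append(f"{pad}{level:02d}  {prefix}{name} PIC {picture}.")
    return lines
```

For example, generate_record("STOCK-REC", [("STOCK-NO", "X(13)")], prefix="WS-") yields a 01-level entry followed by one 05-level entry, both carrying the WS- prefix. Because the text is generated once and then copied, nothing ties it to later changes in the dictionary, which is the weakness of static interfaces noted in this section.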

Static interfaces for software packages, such as DDL processors,
communication monitors, and query processors, produce formatted
statements for those packages or create specially encoded control
files for their use.

Static interfaces are prevalent because of their utility,
capability, and efficiency. With powerful static interfaces, the
data administrator can quickly change formatted metadata or create
new formatted definitions from existing D/D entities. The static D/D
can be made compatible with many versions of other software packages
and can be developed independently of the source code of particular
software packages. A disadvantage to the user of a static interface
is the extra effort that may be required to generate and catalog
metadata for the D/D.

More significantly, the static interface itself has no capabilities
for updating the metadata of the systems with which it interfaces.
Without adequate synchronization and controls, the metadata in the
DDS and the metadata in other systems may become inconsistent
[Ref. 5].

Dynamic interfaces provide direct access between the DDS and other
software modules. This direct access is commonly achieved via
high-level interface commands that shield the software package from
the physical details of the D/D. The commands activate standard DDS
functions, for example to select all entity occurrences that satisfy
a particular condition. A DDS can provide a facility that makes
commands available through call statements; any program can then
access the D/D without knowledge of its physical structure. Dynamic
interfaces provide consistency control and capabilities for both
update and retrieval. Changes to the D/D are automatically reflected
in the next execution of any software packages to which the D/D is
interfaced; no intervening procedures are required as with static
interfaces. A software package can directly retrieve and update
metadata stored in the D/D if the user has the authority to do so
and the software package has such a capability. Otherwise the
software package and the user would only have read authority to the
D/D.

Here is where special attention must be given when designing a DDS
for the SPLICE. We said previously, when we described the first and
the second hierarchical structure for SPLICE, that the local copies
of the SPLICE DDS are exact images of the master copy. With this
approach one can imagine what will happen if one program in any of
the 62 LANs attempts to update the metadata stored in the DDS. The
whole consistency of the system is gone. The local copies will no
longer be exact images of the master copy, and many disagreements
can arise. The only solution for the proposed architecture for the
SPLICE DDS is that requests for update, deletion, or addition of
data definitions must be routed via the DDN to the node where the
master copy of the DDS resides. Then the data dictionary
administrator, who is the only person responsible for DDS
maintenance, can approve and make the requested changes in the
master copy. These changes must then be transmitted to the various
locations where copies of the D/D reside and executed there. This,
we believe, is the only procedure under the proposed DDS
architecture which can maintain consistency over the whole SPLICE
system. We cannot say that this kind of operation is purely dynamic,
but neither is it static. We might call it a hybrid interface
function wherein the security and validity checks of the DDS are
always applied.
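
The procedure just described can be sketched in Python as follows. This is a minimal model under our own assumptions: DictionaryCopy, MasterNode, submit and approve_all are hypothetical names, network transmission is replaced by direct method calls, and the administrator's review is reduced to a single approval step.

```python
class DictionaryCopy:
    """One copy of the D/D, either the master or a local image."""

    def __init__(self):
        self.definitions = {}

    def apply(self, change):
        operation, name, body = change
        if operation == "delete":
            self.definitions.pop(name, None)
        else:  # "add" or "update"
            self.definitions[name] = body


class MasterNode:
    """Node holding the master copy; the only place changes originate."""

    def __init__(self, local_copies):
        self.master = DictionaryCopy()
        self.local_copies = local_copies
        self.pending = []

    def submit(self, change):
        # A LAN routes its request here instead of updating locally.
        self.pending.append(change)

    def approve_all(self):
        # The administrator approves each request: the master copy is
        # changed first, then the same change is sent to every copy.
        for change in self.pending:
            self.master.apply(change)
            for copy in self.local_copies:
                copy.apply(change)
        self.pending.clear()
```

Because no local copy is ever changed except by a transaction replayed from the master, all copies stay exact images of it, at the cost of a delay between request and effect.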

The use of dynamic interfaces incurs significant overhead due to the
size and complex structure of the DDS. Application development
support aids, such as preprocessors, source program managers, and
design aids, generally can afford this overhead because response
time is not critical. On the other hand, efficiency is critical for
transaction-processing systems that reference the D/D.

To reduce the potential overhead, common queries may be precompiled
and stored in the D/D. Another technique used to reduce overhead is
for the software package to retrieve all the metadata required for a
transaction at once; thus future accesses for this transaction only
involve memory lookup. Table VIII from [Ref. 5] shows some typical
types of software package interfaces for a DDS.
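
The second technique, retrieving all the metadata for a transaction in one access, can be illustrated with the following Python sketch. The classes CountingDictionary and TransactionMetadata are hypothetical and exist only to make the number of D/D accesses visible.

```python
class CountingDictionary:
    """Stand-in for the D/D that counts how often it is accessed."""

    def __init__(self, entries):
        self._entries = dict(entries)
        self.accesses = 0

    def fetch_many(self, names):
        self.accesses += 1
        return {name: self._entries[name] for name in names}


class TransactionMetadata:
    """Holds everything one transaction needs after a single fetch."""

    def __init__(self, dictionary, names):
        self._cache = dictionary.fetch_many(names)  # the only D/D access

    def lookup(self, name):
        # Later references are plain in-memory lookups.
        return self._cache[name]
```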


In addition to software interfaces, the integration of a DDS into
its environment is provided by convert functions. An organization
using a DDS has a lot of programs, systems, and files to manage. The
data dictionary administrator must encode thousands of maintenance
transactions to capture the metadata of all these applications. The
convert functions of a DDS scan source programs, database
descriptions, and teleprocessing environment descriptions and
automatically produce maintenance transactions, thus sparing the
data administrator many hours of manual effort. Figure 3.3 from
[Ref. 5] illustrates the flow of data through a typical convert
function.

Figure 3.3 System Flow for a Convert Function.

Inputs include the source language statements and the D/D; outputs
are a file of transactions to be input to the D/D maintenance module
(in the case of SPLICE this refers to the maintenance module of the
master copy) and a report.

The D/D maintenance transactions include descriptions of databases,
files, records, groups, elements and programs. The prime purpose of
convert functions is to convert metadata both from user-written
programs and from the local DBMS and its related components.
Table IX illustrates in summary the typical D/D convert function
transactions.
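
To make the flow concrete, the following Python sketch scans COBOL-style data description lines and produces maintenance transactions together with a report, using the current D/D contents as a second input. It is a simplified illustration: the function convert, the transaction tuple layout, and the line pattern are our own assumptions, and a real convert function would have to handle the full source language.

```python
import re

# Matches lines such as "01  STOCK-REC." or "05  QTY PIC 9(5)."
LINE = re.compile(r"^\s*(\d{2})\s+([A-Z0-9-]+)(?:\s+PIC\s+(\S+?))?\.\s*$")


def convert(source_lines, known_names):
    """Return (transactions, report) for one source program.

    known_names is the set of names already described in the D/D;
    those are reported and skipped rather than added again.
    """
    transactions, report = [], []
    for number, text in enumerate(source_lines, start=1):
        match = LINE.match(text)
        if not match:
            continue  # not a data description line
        level, name, picture = match.groups()
        if name in known_names:
            report.append(f"line {number}: {name} already in D/D, skipped")
            continue
        if picture:
            kind = "ELEMENT"
        elif level == "01":
            kind = "RECORD"
        else:
            kind = "GROUP"
        transactions.append(("ADD", kind, name, picture))
    return transactions, report
```

The transactions would then be submitted to the D/D maintenance module, and the report reviewed by the data dictionary administrator.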

Four major characteristics [Ref. 5] for convert functions are:

The content of the generated transactions, where the D/D maintenance
transactions created by a convert function usually also contain the
relationships between data entities.

The input file to a convert function, which can be a source program
or library file.

... names, select lines to scan, select types of transactions to
create, and override generation of some types of metadata, where the
ability to analyze the metadata of source programs can make the DDS
a valuable tool for auditing adherence to software control
techniques.


The environmental dependency of a DDS is determined by its reliance
on a specific hardware configuration, an operating system, a DBMS,
or a teleprocessing monitor. Under ideal conditions a DDS must have
the capability to operate in such an environment without losing
efficiency and functionality. But sometimes the practice deviates
from theory.




In a completely integrated DDS the DBMS accesses stored databases
via the D/D. In a less integrated system, the DBMS may maintain its
own directory file for accessing stored databases.

In the independent approach the DDS is completely autonomous; it
does not rely on any particular DBMS, and the DBMS maintains its own
source of metadata.

In the DBMS application approach the D/D appears to the DBMS as just
another database. The DBMS maintains its own metadata for each
database, and these metadata are separate from the D/D's.

For the SPLICE system, it is proposed that the embedded approach be
used, where the DDS is actually a component of the DBMS. This
approach provides complete integration of the DDS. The D/D is the
only source of metadata. The DBMS utilities provide the D/D
management facilities and the DBMS uses the D/D to directly access
the stored databases. No other directories, internal or external,
exist for the DBMS, and the DBMS and its facilities rely completely
on the D/D for metadata. Such a structure is shown in Figure 3.4.

Figure 3.4 SPLICE Embedded Approach to DDS.


So, for example, a query processor extracts user views from the DDS,
and the DBMS applies integrity constraints specified in the DDS by
the DDS administrator before storing a data element. A major problem
here, that the SPLICE designers must overcome, is the fact that the
DBMS for SPLICE already exists but the DDS does not. The embedded
approach is easier and simpler when both DDS and DBMS are developed
in parallel, but this is not the case for the SPLICE. So special
attention and effort must be applied during the DDS development
phase.
TABLE VIII
Types of Software Package Interfaces for a D/D System

Module                      Description

DDL processor               Creates a schema file

Database control system     Run-time unit of a DBMS

DML processor               Translates DML into call statements

Query/update processor      Provides direct end-user access to
                            stored databases

Batch-code generator        Reduces the time to develop a standard
                            function as compared to a compiler-level
                            language

Source-program manager      Provides security protection, data
                            compression and editing capabilities for
                            source programs

Teleprocessing monitor      Provides the capability of interactive
                            computing to remote terminals

Test-data generator         Creates test files and databases
                            according to user specifications

Design aid                  Analyzes and generates designs of
                            databases or information systems


TABLE IX
Transactions for D/D Convert Function

Module type                 Generated transactions

Programming                 Element, group, record, file, and
                            sometimes subschema and process

Database description        Database, file, subschema, relationship,
                            record, group, element

Teleprocessing              Terminal, line, processor, transaction
A. GENERAL

The term "Session" is defined in [Ref. 4] as follows:

"Session: All the activity (message exchange and processing) which
takes place between two or more processes for the duration of a
single task (e.g. text editing or processing of a transaction
file)."

The session services module of the SPLICE has to play the role of
coordinating the activity of the other functional modules (FM's) and
providing them with work instructions via the service codes it
inserts in messages to the FM's. The sequence of operations may be
data dependent or highly interactive, so in some cases the work
breakdown cannot be completely determined in advance by session
services. In such cases session services passes control to the first
(controlling) FM which is to perform an operation, and subsequent
"calls" to other FM's, if any, take place according to processing
conditions. In all cases, however, session services passes control
to the first (controlling) FM, even when the FM's which will be
involved cannot all be determined in advance. Session services
retains and maintains state information until either a completion
message or an error message has been received from the controlling
FM. In the case of a message which is destined for an object located
in another network, this fact is indicated in the "message type"
field. The physical destination address would have been obtained
previously from the data dictionary, which exemplifies the
relationship between session services and the data dictionary.

Figure 4.1 Cooperation Between SS and Functional Modules.

Session services is used in a distributed environment and involves
the seven layer architecture model of the ISO for distributed
networks. The ISO seven layer architecture is a standard one and
involves the following layers with associated functions:


Layer           Function

Application     User process

Presentation    Format data the way the user wants it

Session         Sets up session between communicating processes

Transport       End to end control

Network         Switching, routing

Data link       Reliable transmission between two nodes

Physical        Physical transmission of bits between two nodes

The complexity of the SPLICE processing environment requires that
user terminal processes be given considerable assistance in carrying
out their tasks [Ref. 4]. Session services can provide this
assistance. User terminal processes specify task environments,
largely by task name and with the assistance of the data dictionary,
where necessary (Figure 4.1).


B. ARCHITECTURE INTERFACES


In the SPLICE layered architecture, the interfaces between the
layers are critically important. In particular, we are very
interested in the software interfaces between the modules which
communicate with the data dictionary. These modules are the session
services module and the DBMS module. Some forms of software
interfaces between DBMS and D/D can be found in the current
literature [Ref. 5]. On the other hand, no one has yet defined the
required software interfaces between the D/D and session services
modules. We believe that the above mentioned software interfaces
must be of the same type and closely related to the interfaces
between the end user and the session services. In a centralized
system where session services does not exist, the end user has to
interface directly with the D/D, but in a distributed system the
session services module acts as the mediator between the end user
and the data dictionary. As a minimum, then, the interfaces between
session services and the data dictionary in a distributed system
must include the interfaces between end user and data dictionary in
the centralized model.


The interfaces between the above modules must be designed to
accommodate new mechanisms and, as far as possible, new functions
when they arise. As new mechanisms and network functions come into
use in the system, it is highly desirable that previously written
programs continue to work. This is achieved by designing the
interfaces appropriately and preserving them. In the seven layer
architecture, layers 4, 5, 6 and 7 provide end-to-end communication
between sessions in user machines. Layers 1, 2 and 3 provide
communication with the nodes of the shared network.

Because the SPLICE system uses a modified ISO layered approach, the
interfaces between machines need to be defined in terms of the
layers. So we will have layer headers and control messages that are
passed between the layers. The application programmer does not need
to know anything about these. For example, any command language,
using commands similar to GET, PUT, OPEN, CLOSE and DELETE, can
refer to data or facilities in a distant machine.


C. THE SESSION SERVICES LAYER


There are differences in the session services provided, depending
upon the type of network. In the distributed environment different
types of user software need different types of session services.
These differences involve not only the software but also the
architecture. So one set of session services may be provided for one
manufacturer's architecture and a different set for another. This is
very important for the SPLICE because the hardware used throughout
the system varies. It may be possible that services provided across
the system are of different types. However, it is desirable to have
common session services, because this will facilitate the
maintenance task. Also, for interfacing purposes we want session
services to present a common image to the system. This can be
accomplished by hiding necessary interface units from the session
services. In [Ref. 10, p. 491] there is a description of possible
functions of the session services subsystem in a distributed
network. These functions are generally divided into three large
groups:


-Functions required when setting up or disconnecting a session.

-Functions used during the normal running of a session.

-Functions employed when something goes wrong, such as a node
 failure or a protocol violation.

More precisely, these functions are divided into the following
categories:

--Assistance in establishing a session
--Basic networking functions
--Application macroinstructions
--Program control functions
--File access functions
--Recovery and error control
--Editing and translation
--Dialogue software
--Virtual operations and transparency
--Compaction
--Payment functions
--Security and audit functions


D. INTERFACES


Functional interfaces between session services and data
dictionary must permit other software modules to access the
D/D and convert metadata into the format required by the
DDS.

A DDS provides many functions and features such as:

Maintenance
Extensibility
Report processor
Query processor
Convert
Software interface
Exit facility

The software interface function must provide a formatted
pathway enabling the DDS to provide metadata to other
software systems such as compilers and DDL processors
[Ref. 5], to retrieve information from the DDS, to update
information where it is permitted, and to obtain the
restriction protocols for data consistency and integrity. The
software interface can generate file descriptions for storage
in a program library, or accept the user identification and
generate a copy of that user's database view. It is not
possible for this study to describe precisely the software
interfaces needed for the SPLICE system. Because this system
is under development, many aspects of the system are still
unknown and the software modules are not yet described in
full detail. So, we will only outline some of the software
interfaces without claiming that these are sufficient for the
SPLICE system. Interfaces can be added to the system during
the later stages of the system life cycle and existing
interfaces can also be changed or improved as needed.

Because COBOL is used throughout the system, the COBOL
"GENERATE" command can create from the D/D fully formatted
file and record definitions that can be stored in a library
file. Most COBOL clauses can be included, such as 88 levels,
SYNCHRONIZED, REDEFINES, etc. The OPTION clause of this
command can permit changes in names, the designation of
sequence numbers, level numbers and identifiers, and the
inclusion of program comments. An example of the use of this
command can be found in [Ref. 5: p. 261]. The generation
date, last revision date, and revision number can be
automatically recorded in both the listing and the D/D.

The output file can also contain job control statements.
The output file can then be executed as a job that creates
and catalogs the COBOL metadata as a member of a library
under control of any of the various source program managers.

A DML processor can also be used to interface between
the session services and data dictionary. A source program
triggers the DML processor by sending a service code
through the session services, and the DML processor
interacts with the data dictionary/directory. The output of
the DML processor is an expanded source program that is sent
to a compiler for compilation.
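
The expansion step performed by a DML processor can be pictured with a minimal Python sketch. Everything named here is an assumption made for illustration: the `INCLUDE RECORD` statement, the `DATA_DICTIONARY` contents, and the `expand_source` function are hypothetical and do not come from SPLICE or from any real DML. The routing of the service code through the session services is left out.

```python
import re

# Hypothetical dictionary contents: record name -> list of (field, COBOL picture).
DATA_DICTIONARY = {
    "STOCK-ITEM": [("STOCK-NUMBER", "X(13)"), ("QUANTITY-ON-HAND", "9(7)")],
}

# Hypothetical DML statement recognised by this sketch: "INCLUDE RECORD <name>."
INCLUDE = re.compile(r"^\s*INCLUDE RECORD ([A-Z0-9-]+)\.\s*$")


def expand_source(source_lines, dictionary):
    """Replace each INCLUDE RECORD line with a record layout from the dictionary."""
    expanded = []
    for line in source_lines:
        match = INCLUDE.match(line)
        if match is None:
            expanded.append(line)  # ordinary source line: pass through unchanged
            continue
        name = match.group(1)
        fields = dictionary[name]  # KeyError if the record is not in the dictionary
        expanded.append(f"01 {name}.")
        for field_name, picture in fields:
            expanded.append(f"    05 {field_name} PIC {picture}.")
    return expanded


program = ["INCLUDE RECORD STOCK-ITEM.", "PROCEDURE DIVISION."]
for line in expand_source(program, DATA_DICTIONARY):
    print(line)
```

The expanded list of lines stands in for the expanded source program that would be handed to the compiler.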

Other kinds of interfaces include query processors,
source program managers, various user interface facilities,
and other software packages.

Figure 4.2 Software Interface Using a DML Processor.

V. D/D IN DISTRIBUTED ENVIRONMENT


A. INTRODUCTION


In this chapter we will consider the design and function
of DDS in the distributed database environment. Some
extensions to the centralized D/D are needed in order to
enable it to function effectively in a distributed
environment.

The distributed system is a subset of a general
information system. It is not necessary for the user to know
how or where the data is stored, in what way the data will be
accessed by a program, or how and where the processing is
accomplished. Unless the dictionary plays a highly active
role in the running of the distributed system, there is
little need to try to share one dictionary over the entire
network. This is because there is not likely to be a large
amount of update activity in a dictionary. The dictionary
can normally be reproduced at each node and this is the
proposed solution for SPLICE. By using such an architecture,
problems of updating the dictionary across the network
can be solved without much overhead.

Of course the problem of distributed control in a
network is more complex than that of the hierarchical
architecture of dictionary systems which has been discussed
in chapter two. This is one reason, in addition to the lack
of experience with distributed data dictionary systems, why
we proposed replication instead of distribution of the data
dictionary for SPLICE. The more the dictionary system acts
as either the control mechanism or a repository of control
information, the more complex the DBMS, network operating
systems, and dictionary system interactions become. For
example, in the case where we want to determine the best
location for running a query against a distributed and
partially replicated database [Ref. 6], the dictionary system
is required to retain information on the location of all
data. Indeed, this may be highly dynamic itself, and
therefore the line between a dictionary and a "real" database
becomes very fuzzy.

Creation of a distributed information resource implies
that a number of hardware and software components are to
be designed and integrated into a controlled environment.
These components in SPLICE include several databases and
database management systems, user language interfaces, data
dictionary/directory catalogue, transaction controllers and
data input/output control modules. We will describe the
various system components and we will also attempt to
demonstrate the integration of them with the International
Organization for Standardization (ISO) communications
architecture, and a data storage and retrieval architecture
(DSRA).

In general, a distributed system must provide to the end
user transparency, data sharing, data transfer, process
management, and a facility for combination of strategic,
managerial and operational reporting. In order to do that
there are several environmental constraints that must be
satisfied [Ref. 12]. These are:

Data communications
Data storage and retrieval
Metadata
User language support
Process and report management
Information representation
System management
Integrity
Security

For the SPLICE system, communication must be integrated
with cooperative processing of the various different
existing software and hardware. In order to do that we need
to address the considerations of the database interface with
distributed system tasks.

A distributed database is particularly useful to
applications that involve extensive processing in different
locations. SPLICE fits exactly in the above concept, as do
applications such as airlines, banking, retail, and military
command and control. The distributed database of SPLICE can
be allocated among the nodes of the network according to
various existing criteria for fragmentation. To avoid
confusion in distributed systems two different terms are
used: partitioned database, which consists of non-overlapping
subsets, and replicated database, which has some data
redundancy [Ref. 5]. Replication enforces the locality and
availability of the database and reduces the frequency of
accessing the DDN, but requires the DBMS to provide more
sophisticated concurrency and recovery procedures. To avoid
expensive overhead in data management, restrictions must be
established as to the degree of data replication permitted.
SPLICE belongs in the class of replicated database because
the same item of the database can be located in several
locations and the local databases provide information for
items stored in only one location.
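
The distinction between a partitioned and a replicated database can be made concrete with a short Python sketch. The function names and the sample allocation are hypothetical, chosen only to illustrate the two definitions above; they are not part of SPLICE.

```python
def classify_allocation(allocation):
    """Classify a mapping of node name -> set of data items.

    Returns "partitioned" when no item is stored at more than one node
    (non-overlapping subsets) and "replicated" when some item is redundant.
    """
    seen = set()
    for items in allocation.values():
        if seen & set(items):      # an item already held by an earlier node
            return "replicated"
        seen |= set(items)
    return "partitioned"


def replicated_items(allocation):
    """Return the items held at more than one node, with the nodes holding them."""
    holders = {}
    for node, items in allocation.items():
        for item in items:
            holders.setdefault(item, []).append(node)
    return {item: nodes for item, nodes in holders.items() if len(nodes) > 1}


# Hypothetical allocation: PART-1 is redundant, PART-2 and PART-3 are not.
allocation = {
    "LAN-A": {"PART-1", "PART-2"},
    "LAN-B": {"PART-1", "PART-3"},
}
print(classify_allocation(allocation))   # replicated
print(replicated_items(allocation))      # {'PART-1': ['LAN-A', 'LAN-B']}
```

A restriction on the degree of replication could be expressed as a limit on the length of each list returned by `replicated_items`.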

Major problems in the development of techniques for a
distributed database are due to communication volumes and
delays and to the potential for parallel processing.
Sometimes it is very difficult to apply to distributed data
processing working solutions which are borrowed from the
centralized processing concept. These solutions often work
well only in one environment and do not transfer
efficiently, so excessive delays may occur. Parallel
processing also has the potential to increase throughput,
but requires complex controls to synchronize concurrent
activities at dispersed sites. Because a data dictionary is
a database containing metadata, the same problems existing
in distributed databases also exist in a distributed data
dictionary. In [Ref. 5] are described five basic problems
which must be addressed in distributed data management:

-The coordination of the DBMS with the data transmission
network such that reliable delivery of messages can be
ensured.

-The decomposition of transactions into atomic parts,
selection of nodes to execute those parts, and control of
any movement of data between sites necessary to process
transactions.

-The synchronization of logically related updates and
retrievals that are processed at different nodes.

-The detection and resolution of conditions where a part
of the database becomes inaccessible due to node or line
failure.

-The management of metadata describing the distributed
database and environment. This last problem refers
particularly to the data dictionary and deserves special
attention.


B. EXTENSIONS TO THE DDS


The role of a D/D in a distributed database environment
is very significant because it contains important
information about the description of the database
distribution, the characteristics of the nodes and other
aspects of the data communication network. Some additional
entities must be included in the DDS [Ref. 5]:

-The database entity, which describes the global view of
the database and includes attributes for relation and
attribute names, validity constraints, as well as
identification of local databases.

-The fragment entity, which describes portions of the
local database. This entity is not useful for SPLICE
because there are no fragments of the local database.

-The topology entity, which describes the physical
configuration of network components and the links between
those nodes.

-The node entity, which describes the combination of
network subcomponents at a particular site of the network.

-Finally, some other entities (terminal, line,
multiplexor, processor) describing network design.

We cannot say exactly what new entities should be added
to the SPLICE DDS, but at least initially, we believe that a
form of topology and node entities must be included. Those
entities are needed when non-local requests are processed,
because the software performing transaction management needs
to reference the D/D to determine the location of the needed
data, the user's access privileges, the status of addressed
nodes, etc. The interfaces needed for this purpose can be
dynamic or static, exactly as in the centralized case.
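
One way to picture the node and topology entities, and how transaction management might consult them for a non-local request, is the Python sketch below. All class, attribute, and function names (`NodeEntity`, `TopologyEntity`, `locate`) are hypothetical, and access privileges are deliberately left out; the sketch covers only data location and node status.

```python
from dataclasses import dataclass, field


@dataclass
class NodeEntity:
    """Node entity: the subcomponents and status of one network site."""
    name: str
    subcomponents: list
    available: bool = True


@dataclass
class TopologyEntity:
    """Topology entity: the nodes and the links between them."""
    nodes: dict = field(default_factory=dict)   # node name -> NodeEntity
    links: set = field(default_factory=set)     # frozenset({node_a, node_b})

    def add_link(self, a, b):
        self.links.add(frozenset((a, b)))

    def linked(self, a, b):
        return frozenset((a, b)) in self.links


def locate(data_item, requester, data_locations, topology):
    """Return available nodes holding data_item that are linked to requester."""
    return [
        name
        for name in data_locations.get(data_item, [])
        if topology.nodes[name].available and topology.linked(requester, name)
    ]


topology = TopologyEntity()
for name in ("LAN-A", "LAN-B", "LAN-C"):
    topology.nodes[name] = NodeEntity(name, ["terminal", "processor"])
topology.add_link("LAN-A", "LAN-B")
topology.add_link("LAN-A", "LAN-C")
topology.nodes["LAN-C"].available = False   # simulate a node failure

data_locations = {"PART-123": ["LAN-B", "LAN-C"]}
print(locate("PART-123", "LAN-A", data_locations, topology))   # ['LAN-B']
```

In a static interface the `data_locations` table would be bound when a program is prepared; in a dynamic one it would be consulted at run time, as in this sketch.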


C. THE DDS AS A DISTRIBUTED DATABASE


Practically, the D/D, when supporting a distributed
system, becomes itself a distributed database. The contents
of the D/D may reside at various locations. We cannot say
that this approach fits exactly in the SPLICE idea. The
approach we have proposed for SPLICE is quite different.
No partition of the D/D is permitted. That means the D/D
cannot be a distributed database as we know it in the
original form. For the solution proposed for the SPLICE DDS,
we may say that it is based on replication instead of
distribution of the DDS. On the other hand, there are some
other reasonable solutions which follow more closely the
distributed concept. Since experience with distributed
systems is relatively small, the steps needed to reach a
decision must be taken very carefully in order to avoid
mistakes.

The designer of a DDS encounters some of the same basic
problems as does the designer of a distributed database.
When we design a D/D we must determine the extent of
environmental dependency between the D/D and the DBMS. As we
said before, the distributed D/D is an extension of the
centralized one and so the three basic variations to the
type of relationships between a DDS and a DBMS are still in
force. In the independent distributed approach the DDS has
no running connections to any portions of the DBMS and is
not actively or directly used in transaction processing by
the DBMS. In the DBMS-application approach the D/D is just
another distributed database to the DBMS and separate data
management functions are not needed to handle the D/D. The
DBMS may manage its own run-time directory that is separate
from the D/D. In the embedded distributed approach the D/D
provides the run-time directory for the DBMS. All the
components of the DBMS obtain their metadata from the D/D.
The size, location, and contents of the D/D would also
affect the performance of other DDS functions such as
maintenance, reporting, and query [Ref. 5].


D. A MODEL FOR A DISTRIBUTED DDS


In this section we are going to examine a distributed
model for the SPLICE DDS. Its structure is shown in Figure
2.4, and involves the partition of the global DDS into
different views containing information for one or more local
databases. These different views can be located at each or
selected LAN's.

The global (or network) dictionary is the nucleus around
which all the management functions of a DDS are centered.
It contains [Ref. 11] information to start every management
process of the SPLICE distributed database. Specifically, it
contains:
a.-Information for the DDS design
-File access programs
-Total volumes of queries for each file
-Total volumes of updates for each file
This statistical information is very useful, especially
for evaluating the optimal number of redundant copies.

b.-Information for the distribution function
-Number and types of transmission links, their unit
cost, their mean utilization factor
-Routing tables
-CPU workloads
-Disk utilization
This information can help determine the optimal
allocation of redundant file copies and of possible
operation parallelism.

c.-General information about data and how data is shared
among the various nodes of the system, the number of D/D
copies, and where they are located.

d.-Information about existing constraints, status of the
system, node failures etc.
e.-Information about data transportability.

f.-Information related to data used by applications
having a global view. Such applications are, for example,
those where different local databases are involved for
execution. We said in a previous section that sometimes
data redundancy is preferable over the frequent use of the
DDN. That means information about the sites where a
component (i.e. spare part) is located must be somewhere in
a central position. So in the case where the component
cannot be found in the local database, the user has to
access the global data dictionary to find the places where
the particular item is located.

To be able to design and run application or retrieval
programs, the global D/D must contain information [Ref. 11]
about:

Data structures
Data location
Data availability
Data accessibility (related to security, compatibility
etc)
Data translation maps, access paths
Data entities
Common procedures
Events and their interrelations

This dictionary must be able to answer queries about the
DB and DBMS's involved in a transaction and how the
transaction can be formulated to obtain the most efficient
result.

Local dictionaries include information about local
databases and applications, local data entities, local
procedures, local interrelations, physical storage structures
of local data, access methods, access paths, physical storage
devices, and redundancy of data items.

In [Ref. 11] a structure is proposed for a distributed
D/D quite different from the SPLICE approach. This
structure, as shown in Figure 5.1, involves the existence of:

Network dictionary
Global external dictionary
Global conceptual dictionary
Local external dictionary
Local conceptual dictionary
Internal dictionary

and each one of the above performs a different function.


This architecture, which is purely distributed, is
probably too complicated to be implemented for SPLICE. It
is a theoretical model and if we try to implement it, we may
face serious interface problems, resulting in the data
dictionary becoming the main resource consumer.

Figure 5.1 A Purely Distributed Approach for a DDS.

The functions we intend to include in the SPLICE DDS
will play a major role if we want to avoid complex structure
and saturation. These functions must be the minimum
possible needed for the proper operation of the system. We
believe that, in the case where the distributed instead of
the replicated approach is followed, the architecture shown
in Figure 2.4 is the more practical.

Following the above architecture, a global dictionary
located in some node has the role of maintaining consistency
throughout the whole SPLICE system. Requests for updates,
deletions, and additions are routed through the data
dictionary administrator and after an evaluation procedure
the global dictionary is updated. Then the changes are
transmitted to various locations where the local copies are
updated. Updates are also transmitted to the data
directory.
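
The update path just described (evaluation by the administrator, update of the global dictionary, then transmission to the local copies) can be sketched in Python as follows. The class names and the `evaluate` callback are hypothetical stand-ins; in particular, the evaluation procedure is reduced to a single function, and the data directory could be treated as one more entry in `local_copies`.

```python
class DictionaryCopy:
    """One copy of dictionary metadata: entry name -> description."""

    def __init__(self):
        self.entries = {}

    def apply(self, action, name, description=None):
        if action == "delete":
            self.entries.pop(name, None)
        else:  # "add" and "update" both store the new description
            self.entries[name] = description


class GlobalDictionary(DictionaryCopy):
    def __init__(self, local_copies, evaluate):
        super().__init__()
        self.local_copies = local_copies   # copies held at other locations
        self.evaluate = evaluate           # administrator's evaluation procedure

    def request(self, action, name, description=None):
        """Apply a change centrally, then transmit it to every local copy."""
        if not self.evaluate(action, name, description):
            return False                   # rejected: nothing is changed anywhere
        self.apply(action, name, description)
        for copy in self.local_copies:
            copy.apply(action, name, description)
        return True


def approve_non_empty(action, name, description):
    """Hypothetical evaluation rule: additions and updates need a description."""
    return action == "delete" or bool(description)


local_copies = [DictionaryCopy(), DictionaryCopy()]
global_dictionary = GlobalDictionary(local_copies, approve_non_empty)
print(global_dictionary.request("add", "STOCK-ITEM", "stock record layout"))  # True
print(global_dictionary.request("add", "BAD-ENTRY", ""))                      # False
print(local_copies[0].entries)   # {'STOCK-ITEM': 'stock record layout'}
```

Because every change passes through one `request` method, the copies cannot diverge unless a transmission fails, which this sketch does not model.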

Data directories can be located at the inventory control
points (ICP). In contrast with the data dictionary, the
data directory contains global information only about
subject, service code, object name, and address. All the
other information is located in the global and the various
local dictionaries. The data dictionary administrator is
responsible for maintaining the data directory as well.
Different views of the global dictionary are located in
various LAN's. Each view can serve one or more LAN's, and it
is preferable for it to be located at the LAN where it is
most frequently used in order to avoid unnecessary usage of
the DDN.
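
A directory holding only these four attributes can be pictured as a small lookup table. The Python sketch below is an illustration under assumed names (`DirectoryEntry`, `DataDirectory`, `find_item`, and the sample addresses are all hypothetical); it shows a local lookup that falls back to the directory when the item is not held locally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectoryEntry:
    subject: str
    service_code: str
    object_name: str
    address: str        # location of the database holding the object


class DataDirectory:
    def __init__(self, entries):
        # Index by object name; one object may be held at several addresses.
        self._by_object = {}
        for entry in entries:
            self._by_object.setdefault(entry.object_name, []).append(entry)

    def locate(self, object_name):
        """Return the addresses where object_name is held (empty if unknown)."""
        return [e.address for e in self._by_object.get(object_name, [])]


def find_item(object_name, local_database, directory):
    """Look locally first; fall back to the data directory for remote addresses."""
    if object_name in local_database:
        return ("local", [])
    return ("remote", directory.locate(object_name))


directory = DataDirectory([
    DirectoryEntry("spare parts", "SC-01", "PART-123", "LAN-B"),
    DirectoryEntry("spare parts", "SC-01", "PART-123", "LAN-C"),
])
print(find_item("PART-001", {"PART-001"}, directory))   # ('local', [])
print(find_item("PART-123", {"PART-001"}, directory))   # ('remote', ['LAN-B', 'LAN-C'])
```

The returned addresses are what a user would need in order to request a session with the remote database.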

When an item is not found in the local database, the user
routes a value location request through the session services
(service code) to the data directory, and the data directory
replies with the location address. Using this information
the user can request and establish a session with the remote
database where the requested information resides.

VI. CONCLUSIONS AND RECOMMENDATIONS


A. CONCLUSIONS


Our objectives, as described in the first chapter, were
to investigate the area of data dictionary/directory
systems in a distributed environment, to outline the
advantages/disadvantages of these systems, to present the
underlying ideas, to examine the benefits for the SPLICE
system from using a dictionary/directory system, and finally
to delineate the interface requirements between a data
dictionary/directory system and other functional modules.
In addition to the above objectives we also discussed some
ideas concerning the organization of the data administration
function, and four hierarchical architectures for DDS, each
one with a different degree of distribution.

The first architecture is based on the replication of
the D/D. There are no different views of the D/D, only
exact copies of one view located in each LAN. Using this
architecture we have 62 replicated copies of the D/D (the
same as the number of LAN's), each containing the
information (metadata) about all SPLICE data base definitions
and functions residing in each LAN. This architecture
minimizes access to the DDN but has the drawback of requiring
a lot of secondary storage. The size of the D/D, statistical
and other information concerning the frequency of using the
DDN, and the amount of information included in the D/D, all
will have an impact on the effectiveness of this
architecture.

The second architecture, which allocates replicated
copies of the D/D to selected nodes (the most active), is
more conservative. In the case of a huge dictionary, this
saves a significant amount of secondary storage, but
requires heavier use of the DDN. Here the size of the D/D
and the appropriate nodes at which to install the replicated
copies seriously affect the effectiveness of this
architecture.

The third architecture is based on distribution of the
D/D. Different views of the D/D reside in each LAN and
contain information only concerning the local data base.
This architecture involves the use of a data directory (we
propose two replicated copies, one located in each ICP).
The use of the data directory (which contains limited
information) provides a kind of "relation or connection"
between the various views. Also a global dictionary is
needed in order to provide consistency and global function
facilities throughout the system. This architecture is more
dynamic than the previous two discussed so far. It has the
advantage of saving secondary storage but, on the other hand,
increases even more the use of the DDN.
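
The storage side of the trade-off among the first three architectures can be expressed as simple arithmetic. The Python sketch below is only a back-of-the-envelope model: the function names are hypothetical, all sizes are made-up figures in arbitrary units, and DDN traffic, which moves in the opposite direction, is not modelled because the statistics needed are not available here.

```python
LAN_COUNT = 62  # number of LAN's, as stated in the text


def full_replication_storage(dictionary_size, lan_count=LAN_COUNT):
    """First architecture: one exact copy of the whole D/D in every LAN."""
    return dictionary_size * lan_count


def selective_replication_storage(dictionary_size, selected_nodes):
    """Second architecture: whole copies only at the selected (most active) nodes."""
    return dictionary_size * selected_nodes


def distributed_storage(view_sizes, global_size, directory_size, directory_copies=2):
    """Third architecture: one view per LAN, a global dictionary, replicated directories."""
    return sum(view_sizes) + global_size + directory_size * directory_copies


# Hypothetical sizes in arbitrary storage units.
print(full_replication_storage(100))                    # 6200
print(selective_replication_storage(100, 5))            # 500
print(distributed_storage([2] * LAN_COUNT, 100, 10))    # 244
```

With real dictionary sizes and DDN usage statistics, the same three functions could be paired with a traffic estimate to compare the candidates.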

A fourth architecture was discussed just to mention
another possibility for a distributed architecture, but our
estimation is that it would be too expensive in system
resource consumption for SPLICE.

Three environmental dependency options for the DDS
(independent, completely integrated, and DBMS dependent)
were also discussed. The main reason for choosing the
embedded (DBMS dependent) approach is that the data
dictionary is going to be used only for the SPLICE system
(so the independent approach does not make any sense), and
also that the SPLICE data base already exists. The embedded
(DBMS dependent) approach was also chosen because of the
homogeneity of the DBMS environments across LAN's. The
independent and completely integrated approaches are too
costly at this time, although the latter could be implemented
eventually from an embedded environment.


B. RECOMMENDATIONS


From the investigations performed, we have the following
main recommendations for the SPLICE system:
a.- The TANDEM data dictionary that already exists
should be the basis for the SPLICE data dictionary.
b.- The D/D should be implemented only for new
applications because it is a herculean task to retrofit the
D/D to the existing old applications.
c.- The embedded (DBMS dependent) approach should be
used for the D/D.
d.- Two candidate architectures should be examined
further based on statistical and other information (not
available for the present thesis):
-Replicated architecture (Figure 2.3) with
selection of nodes where each copy will reside.
-Distributed architecture (Figure 2.4) with the
use of two replicated copies of the data
directory located at each ICP.
e.- A DML processor should be used to interface between
data dictionary and session services.

TANDEM DATA DICTIONARY


1. Overview


This appendix is included to mention some features
(hopefully the most important) of the TANDEM data dictionary,
since the TANDEM DBMS will be used in the SPLICE system.
For a more detailed description of the TANDEM D/D, see
[Ref. 13].

A data definition language (DDL) is a language used
by the data dictionary administrator to describe record and
file structures of a database. After the description, the
resulting source file is input to the DDL compiler, and the
DDL compiler can create data declaration source language for
database records in three languages: COBOL, FORTRAN, and
TAL. The DDL compiler can also produce FUP (file utility
program) file creation commands for database files. The
most significant feature of DDL is its ability to create and
maintain a data dictionary. The TANDEM data dictionary is a
set of seven files that documents the structure and location
of each file in a database.

The DDL provides facilities for updating a
dictionary as the database it describes grows and the
structure of the database files changes. The DDL compiler and
the dictionary it creates serve as a central point of
control over a database.

TANDEM defines a database as a collection of files
structured to serve one or more applications. When a list
of DDL statements --a DDL source schema-- is given to the
DDL compiler, the compiler can produce any of the following
files:
* A data dictionary.
* A FUP file creation command source.
* A data declaration source for COBOL,
FORTRAN, or TAL.
* A schema report summarizing each record's
structure and each file's access keys.
The data dictionary produced by the DDL compiler is
a set of files that forms a permanent record of the database
schema. Thus the database schema, stored as a set of
dictionary files, becomes a system resource. The dictionary
gives database managers information about each file in the
database and also shows how the files relate to each other.
After the dictionary has been created, the DDL compiler can
read the dictionary and produce COBOL, FORTRAN, or TAL data
declaration source for any record defined by the schema.
The dictionary is also used by ENFORM, TANDEM's database
query language and report writer.


2. Creating a Dictionary


The data dictionary files can be created on any
subvolume in the system. The subvolume that is to contain
the data dictionary is specified with the DDL DICT command
(for example ?DICT $STOCKNO.QNTTY). The DDL compiler
creates the dictionary files on the quantity subvolume of
the $STOCKNO volume, and then opens the files for access.


3. Dictionary Reports


TANDEM provides DDL users with ENFORM source for
twelve dictionary reports. The twelve reports document all
of the DEFINITION and RECORD entries in the dictionary,
describing not only their structures, but how they relate to
each other as well.

Once a schema describing a database has been
compiled by the DDL compiler and a dictionary has been
produced, information about the database can easily be
generated with a set of TANDEM provided ENFORM queries. The
reports produced by these queries provide:

* Database documentation.

* Database analysis information.

* Quick access to dictionary contents.

The dictionary reports are produced from ENFORM
source that is available to the user. This means that in
addition to the standard reports, you can obtain customized
reports, tailored to answer specific questions, by simply
editing the TANDEM supplied ENFORM source. The ENFORM
dictionary report source file consists of 12 queries that
produce 12 different reports. Each query is a separate
section. Thus the queries can be run as a complete group,
individually, or in any combination. The 12 dictionary
reports are shown in Table X.


4. Updating the Dictionary


As the database changes, its dictionary can be
updated to reflect the changes by adding, deleting or
modifying DEFINITION and RECORD entries. Table XI is a
summary of the TANDEM dictionary modification functions.
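
The deletion and modification procedures summarized in Table XI share one ordering rule: entries that reference a DEF are removed before the DEF itself. The Python sketch below illustrates that rule only. It is not TANDEM DDL; the function names and the sample entries are hypothetical, and whether the DDL compiler itself enforces the rule is not stated here.

```python
def where_used(target_def, references):
    """Return the entries that reference target_def directly, sorted by name."""
    return sorted(name for name, used in references.items() if target_def in used)


def delete_def(target_def, dictionary, references):
    """Delete target_def only if no remaining entry references it."""
    users = where_used(target_def, references)
    if users:
        raise ValueError(f"{target_def} is still referenced by: {', '.join(users)}")
    dictionary.discard(target_def)
    references.pop(target_def, None)


# Hypothetical dictionary: two RECORD entries both use the ADDRESS DEF.
dictionary = {"ADDRESS", "CUSTOMER", "ORDER"}
references = {"CUSTOMER": {"ADDRESS"}, "ORDER": {"ADDRESS"}}

print(where_used("ADDRESS", references))   # ['CUSTOMER', 'ORDER']

# Remove the referencing entries first, then the DEF itself.
for name in where_used("ADDRESS", references):
    dictionary.discard(name)
    del references[name]
delete_def("ADDRESS", dictionary, references)
print(dictionary)                          # set()
```

The `where_used` lookup plays the same role as the "where used" style of dictionary report: it tells the administrator which entries must be handled before a DEF can be deleted or recompiled.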


TABLE X

Dictionary Report Summary

Report  Description

R1      DICTIONARY OBJECTS- R1 describes each DEF and
        RECORD in the dictionary, giving the time and
        date of creation, the time and date of the
        last modification, and the version number for
        each object.

R2      DEFINITION STRUCTURE- R2 lists all of the
        component groups and fields for each DEF in
        the dictionary.

R3      RECORD STRUCTURE- R3 lists all of the
        component groups and fields for each RECORD
        in the dictionary.

R4      DEFINITIONS USING DEFINITIONS- R4 shows
        which DEFs are referenced by other DEFs.
        The referencing DEF is listed with each
        of its elements that references another
        DEF and the referenced DEF's name.

R5      RECORDS USING DEFINITIONS- R5 shows which
        DEFs are referenced by RECORDs. Each RECORD
        is listed with each of its elements that
        references a DEF and the referenced DEF's
        name.

R6      DEFINITIONS WHERE USED- R6 lists each DEF
        that is referenced by another object, be it
        a DEF or a RECORD. The referencing DEF or
        RECORD is shown in each case.

R7      RECORD ACCESS- R7 lists the file name and
        access keys (both primary and alternate) for
        each RECORD in the dictionary.

R8      RECORD DEFINITION METHOD- R8 shows the method
        used to define each RECORD. The source DEF
        is listed for those RECORDs defined with the
        DEF IS <def name> clause.

R9      REPORT HEADINGS- R9 lists all of the ENFORM
        report headings declared for fields and
        groups within each DEF and RECORD in the
        dictionary.

R10     DISPLAY FORMATS- R10 lists all of the ENFORM
        display formats declared for fields and
        groups within each DEF and RECORD in the
        dictionary.

R11     RECORD COMMENTS- R11 lists the comments that
        immediately preceded the defining RECORD
        statement for each RECORD in the dictionary.

R12     DEFINITION COMMENTS- R12 lists the comments
        that immediately preceded the defining DEF
        statement for each DEF in the dictionary.


TABLE XI

Dictionary Modification Function

Function         Procedure

ADD/DEF          Open dictionary with ?DICT and
                 compile new DEF statement.

ADD/RECORD       Open dictionary with ?DICT and
                 compile new RECORD statement.

DELETE/DEF       Open dictionary with ?DICT, delete
                 all dictionary entries that
                 reference the DEF, and then delete
                 the DEF itself with DELETE.

DELETE/RECORD    Open dictionary with ?DICT
                 and then delete the RECORD entry
                 with the DELETE statement.

MODIFY/DEF       Open dictionary with ?DICT
                 command, then delete all other
                 RECORD and DEF entries that
                 reference the DEF, delete the DEF,
                 recompile the edited DEF, and
                 finally, recompile the DEF and
                 RECORD statements that
                 reference the DEF.

MODIFY/RECORD    Open dictionary with ?DICT
                 and recompile edited
                 RECORD statement.

MODIFY/RECORD    Open dictionary with ?DICT
(with DEF        and delete the RECORD with
changes)         the DELETE statement. Then
                 modify any DEF entries that need
                 to be changed, and finally
                 recompile the new record statement.